52 research outputs found

    Low-latency XPath Query Evaluation on Multi-Core Processors

    XML and the XPath querying language have become ubiquitous data and querying standards used in many industrial settings and across the World-Wide Web. The high latency of XPath queries over large XML databases remains a problem for many applications. While this latency could be reduced by parallel execution, issues such as work partitioning, memory contention, and load imbalance may diminish the benefits of parallelization. We propose three parallel XPath query engines: Static Work Partitioning, Work Queue, and Producer-Consumer-Hybrid. All three engines attempt to solve the issue of load imbalance while minimizing sequential execution time and overhead. We analyze their performance on sets of synthetic and real-world datasets. Results obtained on two multi-core platforms show that while load-balancing is easily achieved for most synthetic datasets, real-world datasets prove more challenging. Nevertheless, our Producer-Consumer-Hybrid query engine achieves good results across the board (speedup up to 6.31 on an 8-core platform).

    Cost-Optimal Execution of Boolean Query Trees with Shared Streams

    The processing of queries expressed as trees of boolean operators applied to predicates on sensor data streams has several applications in mobile computing. Sensor data must be retrieved from the sensors, which incurs a cost, e.g., an energy expense that depletes the battery of a mobile query processing device. The objective is to determine the order in which predicates should be evaluated so as to shortcut part of the query evaluation and minimize the expected cost. This problem has been studied assuming that each data stream occurs at a single predicate. In this work we remove this assumption since it does not necessarily hold for real-world queries. Our main results are an optimal algorithm for single-level trees and a proof of NP-completeness for DNF trees. For DNF trees, however, we show that there is an optimal predicate evaluation order that corresponds to a depth-first traversal. This result provides inspiration for a class of heuristics. We show that one of these heuristics largely outperforms other sensible heuristics, including a heuristic proposed in previous work.

    Cost-Optimal Execution of Trees of Boolean Operators with Shared Streams

    The processing of queries expressed as trees of boolean operators applied to predicates on sensor data streams has several applications in mobile computing. Sensor data must be retrieved from the sensors to a query processing device, such as a smartphone, over one or more network interfaces. Retrieving a data item incurs a cost, e.g., an energy expense that depletes the smartphone's battery. Since the query tree contains boolean operators, part of the tree can be short-circuited depending on the retrieved sensor data. An interesting problem is to determine the order in which predicates should be evaluated so as to minimize the expected query processing cost. This problem has been studied in previous work assuming that each data stream occurs in a single predicate. In this work we remove this assumption since it does not necessarily hold for real-world queries. Our main results are an optimal algorithm for single-level trees and a proof of NP-completeness for DNF trees. For DNF trees, however, we show that there is an optimal predicate evaluation order that corresponds to a depth-first traversal. This result provides inspiration for a class of heuristics. We show that one of these heuristics largely outperforms other sensible heuristics, including the one heuristic proposed in previous work for our general version of the query processing problem.

    Real time business performance monitoring and analysis using metric network

    Continuously monitoring and analyzing business performance is now crucial for enterprises to achieve operational excellence and to better align daily operations with long-term business strategies. To do so, performance measures need to be collected from daily operations and aggregated to construct higher-level Key Performance Indicators (KPIs) in near real time. We propose a system called metric network for enterprise-wide business performance monitoring and analysis. A metric network consists of metrics, metric repositories, aggregation agents, and knowledge agents. We describe in detail the generic procedure patterns of these metric network entities and their communication pattern. Our loosely coupled design makes it easy to enhance features by adding more metrics and agents. The proposed approach is examined using real metrics on a fictitious scenario.

    Efficient Update of Indexes for Dynamically Changing Web Documents

    The original publication is available at www.springerlink.com. Recent work on incremental crawling has enabled the indexed document collection of a search engine to be more synchronized with the changing World Wide Web. However, this synchronized collection is not immediately searchable, because the keyword index is rebuilt from scratch less frequently than the collection can be refreshed. An inverted index is usually used to index documents crawled from the web. Complete index rebuild at high frequency is expensive. Previous work on incremental inverted index updates has been restricted to adding and removing documents. Updating the inverted index for previously indexed documents that have changed has not been addressed. In this paper, we propose an efficient method to update the inverted index for previously indexed documents whose contents have changed. Our method uses the idea of landmarks together with the diff algorithm to significantly reduce the number of postings in the inverted index that need to be updated. Our experiments verify that our landmark-diff method results in significant savings in the number of update operations on the inverted index.

    The Case for Cloud-Enabled Mobile Sensing Services

    Singapore MOE Academic Research Fund Tier

    Energy-efficient collaborative query processing framework for mobile sensing services

    Ministry of Education, Singapore under its Academic Research Funding Tier